feat(runtime): Add series implementation for event recorder #1655
Conversation
hey, thanks for this! looks largely sensible to me, gave a quick go over with some quick questions and minor nits, lmk if my thinking makes sense.
kube-runtime/src/events.rs (outdated):

```rust
cache.retain(|_, v| {
    if let Some(series) = v.series.as_ref() {
        series.last_observed_time.0 + EVENT_FINISH_TIME > now
    } else if let Some(event_time) = v.event_time.as_ref() {
        event_time.0 + EVENT_FINISH_TIME > now
    } else {
        true
    }
});
```
comment suggestion for why there's a retain at the end of a publisher:

```rust
// gc past events older than now + CACHE_TTL
cache.retain(|_, v| {
    if let Some(series) = v.series.as_ref() {
        series.last_observed_time.0 + CACHE_TTL > now
    } else if let Some(event_time) = v.event_time.as_ref() {
        event_time.0 + CACHE_TTL > now
    } else {
        true
    }
});
```
Yep, I put it there because I don't know where else to put it. We have to think of a better place for it.
Ok, I realize that in the end it doesn't make sense if you have to invalidate a previous event and nothing else gets executed. I moved it to the beginning, but I would still like to run it in a better place and I don't know where.

Also, I added a test in d85f318 which I don't like much, but I didn't want to change the API and add a clock to the structure for mocking the time.
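One shape this could take (a sketch only, with illustrative names; `Entry`, the `String` key, and the `CACHE_TTL` value are not kube-runtime's): factor the retain into a small `gc()` helper and run it at the start of `publish()`, so stale entries are evicted before a new event is recorded rather than after.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Illustrative TTL; the real crate would pick its own constant.
const CACHE_TTL: Duration = Duration::from_secs(360);

struct Entry {
    last_observed: Instant,
}

// Evict entries whose last observation is older than the TTL.
fn gc(cache: &mut HashMap<String, Entry>, now: Instant) {
    cache.retain(|_, v| v.last_observed + CACHE_TTL > now);
}

fn publish(cache: &mut HashMap<String, Entry>, key: String) {
    // gc first, so an expired entry for this key cannot be revived by accident
    gc(cache, Instant::now());
    cache
        .entry(key)
        .and_modify(|e| e.last_observed = Instant::now())
        .or_insert(Entry {
            last_observed: Instant::now(),
        });
}
```

Running it up front keeps the eviction next to the only code path that reads the cache, without needing a background task.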
ah, this is the event_recorder_cache_retain test? yeah, that feels like a test that is too low-level (testing internals rather than the interface). it would probably be more readable as a call to publish twice without all the internals.

you could perhaps make a mocking time/duration field conditional on #[cfg(test)] to avoid mucking with the interface. though it's private anyway.
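A minimal sketch of that idea (assumed names, not the actual kube-runtime types): give the recorder an optional mock "now" that tests can set, and route all internal time reads through one method. In the real crate the field and the branch reading it could be gated behind #[cfg(test)]; it is left ungated here only so the example compiles standalone.

```rust
use std::time::Instant;

struct Recorder {
    // In kube-runtime this could be #[cfg(test)]-gated so the public
    // interface (and release builds) are unaffected.
    mock_now: Option<Instant>,
}

impl Recorder {
    fn new() -> Self {
        Recorder { mock_now: None }
    }

    // All internal time comparisons call this instead of Instant::now(),
    // so a test can fast-forward the clock without sleeping.
    fn now(&self) -> Instant {
        self.mock_now.unwrap_or_else(Instant::now)
    }
}
```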
I will take a look when I have time. Thanks for the advice.
```rust
        secondary: None,
    },
    &s.object_ref(&()),
)
```
if we are changing .publish, we should make the integration test also publish twice to the same name and verify we get an EventSeries
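To make the expectation concrete, here is a minimal model of the aggregation semantics such a test should exercise (illustrative types, not the kube-runtime API; the key fields are trimmed to kind/name/reason for brevity): publishing twice to the same key bumps a series count instead of creating a fresh event.

```rust
use std::collections::HashMap;

// A stand-in for the hashed event identity discussed below
// (the real key hashes api_version, kind, name, namespace, uid).
#[derive(Hash, PartialEq, Eq, Clone)]
struct EventKey {
    kind: String,
    name: String,
    reason: String,
}

// Returns the occurrence count after this publish; 1 means a new event,
// anything higher means the event joined an existing series.
fn publish(cache: &mut HashMap<EventKey, u32>, key: EventKey) -> u32 {
    let count = cache.entry(key).or_insert(0);
    *count += 1;
    *count
}
```

An integration test in this spirit would publish the same event twice against the same object and then assert that the fetched Event carries a series with a count of 2.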
Agree. I will try to add them soon.
Fixed on 07145ab
I don't think this is actually tested still. We should have an actual event series, with a count of 2 if we publish twice, and i tried this in controller-rs with this branch and am not seeing event aggregation from kubectl describe. maybe i am doing something wrong though. have outlined my steps in kube-rs/controller-rs#116 and set it up against your branch.
ah. i'm guessing this is because the event key contains an ObjectReference for equality, and objectref contains a resourceVersion, meaning the ones i am changing are actually "not the same object", and would not be aggregated. not sure how i feel about that. in my mind the aggregation ought to be unique per GVK, but maybe that's not the standard way of doing it in kubernetes land 🤔
Indeed, we are not using resourceVersion in Hash:

```rust
impl Hash for Reference {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.0.api_version.hash(state);
        self.0.kind.hash(state);
        self.0.name.hash(state);
        self.0.namespace.hash(state);
        self.0.uid.hash(state);
    }
}
```
I checked your code and saw that the recorder is initialized on each reconcile cycle, so your cache is empty. I initialized it in the context creation instead. In 07145ab we test sending the event twice and check that the series exists. We could add more checks or cases if needed.
ahhh. that explains the changing of the oref argument as well. changing this to create it on the context creates the correct results:

```
Normal  HideRequested  56s (x2 over 75s)  doc-controller  Hiding `samuel`
```
Codecov Report

Additional details and impacted files:

```
@@           Coverage Diff           @@
##            main   #1655    +/-   ##
=======================================
+ Coverage   75.6%   75.9%   +0.3%
=======================================
  Files         82      82
  Lines       7405    7496     +91
=======================================
+ Hits        5591    5682     +91
  Misses      1814    1814
```
Signed-off-by: Alexander Gil <[email protected]>
Following the same pattern as the go-client. Signed-off-by: Alexander Gil <[email protected]>
Now all tests pass, but I forgot about the DCO again... You can review it when you have time, and if more improvements are needed I will work on them. If not, I will squash all those commits and we can merge.
for kube-rs/kube#1655 Signed-off-by: clux <[email protected]>
thanks for the cleanups and sorry for the delay, have had a busy week. a few more small comments.
```
@@ -115,9 +158,10 @@ impl From<String> for Reporter {

impl From<&str> for Reporter {
    fn from(es: &str) -> Self {
        let instance = hostname::get().ok().and_then(|h| h.into_string().ok());
```
had missed this earlier; how does this behave in-cluster in kubernetes? would this take the node name? presumably you are doing this for disambiguation of events in the case of multiple replicas (a not super common setup for controllers afaik).

it seems sensible enough to do, but maybe it's better not to do this implicitly and let people use the downward api to pass in the pod name, instead of needing another dependency to get a less useful node name?
Yes, I had the same concerns as you. I decided to go with this because the hostname crate is really small and I don't expect it to change, and also because client-go behaves this way.

In Kubernetes, it takes the pod name as the hostname, and this is "useful" because even with one replica, if you recreate the pod you can differentiate them.

We can remove it from here; I don't have a strong opinion. Furthermore, it is pretty easy to initialize it as you prefer in your controller.
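For instance, a controller could pass the pod name explicitly rather than relying on the hostname. A sketch of that (the `Reporter` struct here is a local stand-in, not kube-runtime's, and the `POD_NAME` env var is an assumed name for a value injected via the downward API):

```rust
use std::env;

// Stand-in for the recorder's reporter: who emitted the event,
// and optionally which instance of it.
struct Reporter {
    controller: String,
    instance: Option<String>,
}

fn reporter_with_instance(controller: &str, instance: Option<String>) -> Reporter {
    Reporter {
        controller: controller.to_string(),
        instance,
    }
}

fn reporter_from_downward_api() -> Reporter {
    // At startup, read the downward-API-provided pod name, if any;
    // falls back to no instance when the variable is absent.
    reporter_with_instance("doc-controller", env::var("POD_NAME").ok())
}
```

This keeps the dependency decision (hostname crate vs. downward API vs. nothing) in the controller rather than in the library.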
Motivation

The current implementation of the event recorder just creates new events, which is not enough for my event-handling expectations.

Solution

I added a recorder implementation more similar to the one in the client-go library. This implementation caches events locally and groups isomorphic events, incrementing the series count on similar events.